Built by aligning high-quality genomes, saved as paths through the pangenome.
Human Pangenome Reference Consortium (HPRC)
Liao, Asri, Ebler, et al. Nature 2023
A snarl is a subgraph bounded by two node sides that are:
A run of consecutive snarls and nodes is called a chain.
Snarls and chains can be nested inside of each other.
The nested relationship of snarls and chains is described by the snarl tree.
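The nesting described above can be sketched as a small data structure. This is only an illustration, not vg's implementation: the class names `Node`, `Snarl`, and `Chain`, the helper functions, and the example node IDs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    node_id: int


@dataclass
class Snarl:
    start: int  # ID of the first boundary node (hypothetical example IDs)
    end: int    # ID of the second boundary node
    chains: List["Chain"] = field(default_factory=list)  # child chains


@dataclass
class Chain:
    # A run of consecutive nodes and snarls.
    children: List[Union[Node, Snarl]] = field(default_factory=list)


def count_snarls(chain: Chain) -> int:
    """Count every snarl in the chain, including nested ones."""
    total = 0
    for child in chain.children:
        if isinstance(child, Snarl):
            total += 1 + sum(count_snarls(sub) for sub in child.chains)
    return total


def max_snarl_depth(chain: Chain, depth: int = 0) -> int:
    """Return the deepest level of snarl nesting below this chain."""
    deepest = depth
    for child in chain.children:
        if isinstance(child, Snarl):
            deepest = max(deepest, depth + 1)
            for sub in child.chains:
                deepest = max(deepest, max_snarl_depth(sub, depth + 1))
    return deepest


# Top-level chain: node 1, a simple snarl (1..4), node 4,
# a snarl (4..8) whose child chain holds a nested snarl (5..7), node 8.
top_chain = Chain([
    Node(1),
    Snarl(1, 4, [Chain([Node(2)]), Chain([Node(3)])]),
    Node(4),
    Snarl(4, 8, [Chain([Node(5), Snarl(5, 7, [Chain([Node(6)])]), Node(7)])]),
    Node(8),
])

print(count_snarls(top_chain), max_snarl_depth(top_chain))
```

Here the snarl tree is simply the alternation of the two containers: chains hold snarls, and snarls hold child chains.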
Netgraphs are a representation of snarls with their child chains collapsed into a single node.
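A rough sketch of that collapsing step, assuming a simplified graph with plain directed edges between node IDs (real netgraphs also track node sides and orientation, which is ignored here). The function name `build_netgraph` and the example IDs are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple


def build_netgraph(
    edges: Iterable[Tuple[int, int]],
    child_chains: Dict[str, Set[int]],
) -> Set[Tuple[Hashable, Hashable]]:
    """Collapse each child chain of a snarl into one node.

    edges: edges inside the snarl, including those touching its boundary nodes.
    child_chains: chain name -> IDs of all nodes contained in that chain.
    """
    # Map every node inside a chain to that chain's name.
    representative = {}
    for name, members in child_chains.items():
        for node_id in members:
            representative[node_id] = name

    net_edges = set()
    for u, v in edges:
        a = representative.get(u, u)
        b = representative.get(v, v)
        if a != b:  # edges internal to a chain disappear
            net_edges.add((a, b))
    return net_edges


# Snarl bounded by nodes 1 and 6 with two child chains.
# chainA runs 2 -> (nested snarl containing 3) -> 4; chainB is just node 5.
snarl_edges = [(1, 2), (2, 3), (3, 4), (2, 4), (4, 6), (1, 5), (5, 6)]
chains = {"chainA": {2, 3, 4}, "chainB": {5}}

netgraph = build_netgraph(snarl_edges, chains)
print(netgraph)
```

The nested variation inside chainA is hidden in the netgraph, which only shows how the chains connect to each other and to the snarl boundaries.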
vg deconstruct
vcf + graph + decomposition for this graph
Trick for getting this snarl decomposition to look better (currently only for the distance index):
vg index -j [graph.dist] -w 6
vcf + graph + decomposition for this graph
vg giraffe
Short reads
Long reads
On the HPRC v2 graph which is x size?
vg graph formats and indexes
Indexes
.gbwt (Graph Burrows-Wheeler Transform): haplotype paths
.gg (GBWT Graph): node sequences for a GBWT
.dist (Distance Index): snarl decomposition plus minimum distances
.zipcodes: per-node distance information used by vg giraffe
.min (Minimizer Index): minimizers used by vg giraffe
.gcsa (Generalized Compressed Suffix Array): substring index used by vg map and vg mpmap
Graphs
.gbz (GBWT + GG): the graph induced by the GBWT
.hg (/.vg) (HashGraph): graph format optimized for speed
.pg (/.vg) (PackedGraph): graph format optimized for space efficiency
.xg: older graph format
.vg: protobuf-based graph format
vg wiki
vg manpage: https://github.com/vgteam/vg/wiki/vg-manpage
snarls paper doi: 10.1089/cmb.2017.0251
short read giraffe paper doi: 10.1126/science.abg8871
long read giraffe paper doi: 10.1101/2025.09.29.678807